Abstract

Augmented reality is the art of seamlessly fusing virtual
objects into real scenes. In this short note, we address the
opposite problem, inverse augmented reality: given a
perfectly augmented scene in which a human cannot
distinguish real objects from virtual ones, how can a
machine help do the job? We show that with structure from
motion (SFM), a simple technique in computer vision for 3D
reconstruction from images, the real and virtual objects can
be easily separated in the reconstructed 3D scene.

1. Introduction 

Augmented reality (AR) aims to seamlessly fuse virtual
objects of interest into a real scene. One of its ultimate
goals is to make the fusion imperceptible even to an
experienced human observer, and the AR community has made
great efforts and progress toward this goal in recent years [4].

In this work, we address the opposite problem: suppose a
person is immersed in a perfectly fused AR environment and
cannot tell, by his own visual system, which of the objects
around him are virtual and which are real. How could he
resort to a machine, for example his mobile phone or a
common digital camera, to help him tell virtual objects from
real ones? To our knowledge, this inverse problem has not
been addressed in the literature. Here we show that the job
can be easily done with a simple SFM-based 3D reconstruction
from images captured by his mobile phone camera.

2. Inverse Augmented Reality 

Structure from motion (SFM) is a widely used technique
for 3D scene reconstruction from images in computer vision.
By establishing point correspondences across several images
and making some assumptions about the camera's intrinsic
parameters, a dense metric reconstruction of the scene can
now be obtained automatically, for example, with Snavely's
Bundler [3] for sparse reconstruction, followed by
Furukawa's PMVS [1] for dense or quasi-dense reconstruction.
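
To make the role of point correspondences concrete, the sketch below
triangulates one 3D point from two viewing rays with the midpoint method.
It is a deliberately simplified illustration, not the procedure used inside
Bundler or PMVS; the function name, camera centers, and ray directions are
hypothetical values chosen for the example, and the camera poses are assumed
to be already known.

```python
def dot(u, v):
    """Dot product of two 3-vectors given as sequences."""
    return sum(a * b for a, b in zip(u, v))


def triangulate_midpoint(c1, d1, c2, d2):
    """Return the point halfway between the closest points of two rays.

    Each ray is c + t * d, where c is a camera center and d is the viewing
    direction through a matched image point (hypothetical inputs).
    """
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; depth is undefined")
    s = (b * e - c * d) / denom  # parameter of closest point on ray 1
    t = (a * e - b * d) / denom  # parameter of closest point on ray 2
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return [(u + v) / 2.0 for u, v in zip(p1, p2)]


# Hypothetical setup: two cameras one unit apart observing the same point.
camera_1 = (0.0, 0.0, 0.0)
camera_2 = (1.0, 0.0, 0.0)
ray_1 = (0.5, 0.2, 4.0)    # direction from camera_1 through its image point
ray_2 = (-0.5, 0.2, 4.0)   # direction from camera_2 through its image point
point = triangulate_midpoint(camera_1, ray_1, camera_2, ray_2)
print(point)
```

With noise-free rays the two closest points coincide. With real
correspondences the rays generally do not intersect, which is why the
midpoint is taken.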

In the following, we report a case to illustrate the basic
principle and its effect. The Shui-Dong-Gou Archeological
Museum is located in the Ning-Xia-Hui Autonomous Region in
northwest China, at an archeological site of human life and
activities dating back about 40,000 years. An example image
of this museum is shown in Fig. 1(a). Its demo hall
currently exploits AR technology to such an extent that
ordinary visitors can hardly tell which parts of the scene
are virtual and which are real.

As Fig. 1(a) shows, the virtual parts can hardly be
distinguished from the real scene, at least by the majority
of lay visitors. We took about 20 images with a Canon 5D
Mark III camera and reconstructed the AR scene in 3D with
the Bundler+PMVS2 combination, as shown in Fig. 1(b) and
Fig. 1(c). In both figures, the small colored cones are the
calibrated camera poses of the images used for the
reconstruction, and Fig. 1(c) is the top view of the dense
reconstruction. From Fig. 1(c), we can see that the sky
marked by (1) in Fig. 1(a) is in fact a vaulted backdrop,
and the greensward marked by (2) in Fig. 1(a) is also part
of this backdrop. The stones marked by (3) and (4) are real.
In sum, the virtual parts can be clearly separated from the
real scene in the reconstructed 3D scene.
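
The separation above is read off the top view by eye. One way it might be
automated is sketched below, under strong assumptions that are not part of
this note: the vaulted backdrop is modeled as a circular arc in the top
view with a known center and radius, and each labeled region is flagged as
backdrop when its reconstructed points lie close to that arc. All
coordinates, the tolerance, and the function names are hypothetical
synthetic values.

```python
import math
from statistics import median


def radial_residuals(points_xz, center, radius):
    """Distance of each top-view point from the assumed backdrop arc."""
    cx, cz = center
    return [abs(math.hypot(x - cx, z - cz) - radius) for x, z in points_xz]


def lies_on_backdrop(points_xz, center, radius, tol):
    """Flag a region as backdrop if its median residual is within tol."""
    return median(radial_residuals(points_xz, center, radius)) <= tol


# Assumed backdrop model in the top view (hypothetical numbers, in meters).
CENTER = (0.0, 0.0)
RADIUS = 10.0
TOLERANCE = 0.2

# Synthetic top-view points per region; not measurements from the museum.
angles = (0.9, 1.2, 1.5, 1.8)
regions = {
    "sky": [(10.0 * math.cos(a), 10.0 * math.sin(a)) for a in angles],
    "greensward": [(10.05 * math.cos(a), 10.05 * math.sin(a)) for a in angles],
    "stones": [(1.0, 4.0), (1.5, 4.5), (-2.0, 5.0)],
}

labels = {
    name: lies_on_backdrop(pts, CENTER, RADIUS, TOLERANCE)
    for name, pts in regions.items()
}
for name, flag in labels.items():
    print(name, "-> backdrop (virtual)" if flag else "-> off backdrop (real)")
```

The median is used instead of the mean so that a few badly reconstructed
points do not flip the decision for a whole region.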

3. Conclusion 

Before ending this short note, we would like to make the
following two points. (1) In this work, we discuss only a
static AR scene, but the principle and techniques can be
extended straightforwardly to dynamic AR scenes using a
synchronized stereo system. For example, many mobile phones
are nowadays equipped with a micro-array camera, which can
capture several images at the same time. From such
synchronized images, the same 3D reconstruction technique
used for the static scene can be applied to dynamic scenes.
(2) Human 3D visual perception is not necessarily truly 3D
in the physical sense, although machines and humans use
similar mechanisms for disparity computation [2].
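
For the synchronized stereo case in point (1), depth can be recovered from
disparity in a single time instant. The sketch below uses the standard
relation Z = f * B / d for a rectified pinhole stereo pair. The rectified
setup, the function name, and all numeric values are assumptions made for
illustration and do not describe any particular phone camera.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in meters for a rectified stereo pair: Z = f * B / d.

    focal_px     -- focal length in pixels (assumed known from calibration)
    baseline_m   -- distance between the two camera centers in meters
    disparity_px -- horizontal shift of the matched point in pixels
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px


# Hypothetical calibration and two matched points with different disparity.
FOCAL_PX = 1000.0
BASELINE_M = 0.1
near_depth = depth_from_disparity(FOCAL_PX, BASELINE_M, 20.0)
far_depth = depth_from_disparity(FOCAL_PX, BASELINE_M, 10.0)
print(near_depth, far_depth)
```

Because each synchronized pair yields depth on its own, the computation can
be repeated frame by frame for a moving scene.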





(a) (b) (c) 


Figure 1. (a): an image of the Shui-Dong-Gou Archeological Museum; (b): the dense reconstruction of the museum
produced by Bundler+PMVS2; (c): the top view of the dense reconstruction in (b). Labels (1), (2), (3), and (4) in (a) mark some
areas in the scene, and their corresponding reconstruction results are shown in (b).